Applications and Advances of Artificial Intelligence in Music Generation: A Review
Chen, Yanxu, Huang, Linshu, Gou, Tian
In recent years, artificial intelligence (AI) has made significant progress in the field of music generation, driving innovation in music creation and applications. This paper provides a systematic review of the latest research advancements in AI music generation, covering key technologies, models, datasets, evaluation methods, and their practical applications across various fields. The main contributions of this review include: (1) presenting a comprehensive summary framework that systematically categorizes and compares different technological approaches, including symbolic generation, audio generation, and hybrid models, helping readers better understand the full spectrum of technologies in the field; (2) offering an extensive survey of current literature, covering emerging topics such as multimodal datasets and emotion expression evaluation, providing a broad reference for related research; (3) conducting a detailed analysis of the practical impact of AI music generation in various application domains, particularly in real-time interaction and interdisciplinary applications, offering new perspectives and insights; (4) summarizing the existing challenges and limitations of music quality evaluation methods and proposing potential future research directions, aiming to promote the standardization and broader adoption of evaluation techniques. Through these innovative summaries and analyses, this paper serves as a comprehensive reference tool for researchers and practitioners in AI music generation, while also outlining future directions for the field.
Colour and Brush Stroke Pattern Recognition in Abstract Art using Modified Deep Convolutional Generative Adversarial Networks
Srinivasan, Srinitish, Pathak, Varenya
Abstract art is an immensely popular and widely discussed form of art that often depicts the emotions of an artist. Many researchers have attempted to study abstract art through edge detection, brush stroke, and emotion recognition algorithms using machine and deep learning. This paper describes the study of a wide distribution of abstract paintings using Generative Adversarial Networks (GANs). GANs can learn and reproduce a distribution, enabling researchers and scientists to effectively explore and study the generated image space. However, the challenge lies in developing an efficient GAN architecture that overcomes common training pitfalls. This paper addresses this challenge by introducing a modified DCGAN (mDCGAN) specifically designed for high-quality artwork generation. The approach involves a thorough exploration of the modifications made, delving into the intricate workings of DCGANs and the optimisation and regularisation methods aimed at improving stability and realism in art generation, enabling effective study of the generated patterns. The proposed mDCGAN incorporates meticulous adjustments in layer configurations and architectural choices, offering tailored solutions to the unique demands of art generation while effectively combating issues like mode collapse and vanishing gradients. Further, this paper explores the generated latent space by performing random walks to understand vector relationships between brush strokes and colours in the abstract art space, and presents a statistical analysis of unstable outputs after a certain period of GAN training, testing for significant differences. These findings validate the effectiveness of the proposed approach, emphasising its potential to revolutionise the field of digital art generation and the wider digital art ecosystem.
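The random-walk exploration of the latent space described above can be sketched in a few lines. The following numpy-only sketch is illustrative: the step size, sphere projection, and latent dimensionality are assumptions, not details from the paper, and a trained generator G(z) would consume each point of the walk.

```python
import numpy as np

def latent_random_walk(start, n_steps=10, step_size=0.1, rng=None):
    """Take small random steps from a starting latent vector.

    Each step perturbs the current vector with Gaussian noise and
    projects back onto a sphere of constant radius, a common way to
    keep walk samples in the region the generator was trained on.
    """
    rng = np.random.default_rng(rng)
    radius = np.linalg.norm(start)
    path = [start]
    z = start.astype(float)
    for _ in range(n_steps):
        z = z + step_size * rng.standard_normal(z.shape)
        z = z / np.linalg.norm(z) * radius  # project back to the sphere
        path.append(z.copy())
    return np.stack(path)

# Hypothetical usage: each row of `walk` would be decoded by a trained G(z)
z0 = np.random.default_rng(0).standard_normal(128)
walk = latent_random_walk(z0, n_steps=20, step_size=0.05, rng=1)
print(walk.shape)  # (21, 128)
```

Decoding consecutive rows of the walk and comparing the resulting images is what makes the brush-stroke and colour relationships in the latent space visible.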
Neural-Based Music Generation for Intelligence Duplication
There are two aspects of machine learning and artificial intelligence: (1) interpreting information, and (2) inventing new useful information. Much progress has been made on (1), with a focus on pattern recognition techniques (e.g., interpreting visual data). This paper focuses on (2), using intelligent duplication (ID) for invention. We explore the possibility of learning a specific individual's creative reasoning in order to leverage the learned expertise and talent to invent new information. More specifically, we employ a deep learning system to learn from the great composer Beethoven and capture his composition ability in a hash-based knowledge base. This new form of knowledge base provides a reasoning facility that drives music composition through a novel generation method.
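The hash-based knowledge base is not specified in detail in the abstract; a minimal sketch of the general idea, keying a plain Python dict on a hash of the preceding note n-gram, might look like the following. The toy melody and all function names are illustrative, not from the paper.

```python
import random
from collections import defaultdict

def build_knowledge_base(notes, order=2):
    """Map a hash of each n-gram of preceding notes to its observed continuations."""
    kb = defaultdict(list)
    for i in range(len(notes) - order):
        context = tuple(notes[i:i + order])
        kb[hash(context)].append(notes[i + order])
    return kb

def generate(kb, seed, length=16, order=2, rng=None):
    """Walk the knowledge base, drawing each next note from the stored continuations."""
    rng = rng or random.Random(0)
    melody = list(seed)
    for _ in range(length):
        candidates = kb.get(hash(tuple(melody[-order:])))
        if not candidates:  # unseen context: fall back to the seed notes
            candidates = list(seed)
        melody.append(rng.choice(candidates))
    return melody

# Toy "training" melody (MIDI pitches) standing in for a Beethoven corpus
corpus = [60, 62, 64, 65, 67, 65, 64, 62, 60, 62, 64, 62, 60]
kb = build_knowledge_base(corpus)
out = generate(kb, seed=[60, 62], length=8)
print(len(out))  # 10
```

A deep learning system like the one described would learn richer contexts than a literal n-gram, but the lookup-and-continue loop is the same reasoning pattern.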
MSight: An Edge-Cloud Infrastructure-based Perception System for Connected Automated Vehicles
Zhang, Rusheng, Meng, Depu, Shen, Shengyin, Zou, Zhengxia, Li, Houqiang, Liu, Henry X.
As vehicular communication and networking technologies continue to advance, infrastructure-based roadside perception emerges as a pivotal tool for connected automated vehicle (CAV) applications. Due to their elevated positioning, roadside sensors, including cameras and lidars, often enjoy unobstructed views with diminished object occlusion. This provides them a distinct advantage over onboard perception, enabling more robust and accurate detection of road objects. This paper presents MSight, a cutting-edge roadside perception system specifically designed for CAVs. MSight offers real-time vehicle detection, localization, tracking, and short-term trajectory prediction. Evaluations underscore the system's capability to uphold lane-level accuracy with minimal latency, revealing a range of potential applications to enhance CAV safety and efficiency. Presently, MSight operates 24/7 at a two-lane roundabout in the City of Ann Arbor, Michigan.
Multiscale Attention via Wavelet Neural Operators for Vision Transformers
Nekoozadeh, Anahita, Ahmadzadeh, Mohammad Reza, Mardani, Zahra
Transformers have achieved widespread success in computer vision. At their heart is a Self-Attention (SA) mechanism, an inductive bias that associates each token in the input with every other token through a weighted basis. The standard SA mechanism has quadratic complexity in the sequence length, which impedes its application to the long sequences that arise in high-resolution vision. Recently, inspired by operator learning for PDEs, Adaptive Fourier Neural Operators (AFNO) were introduced for high-resolution attention based on global convolution that is efficiently implemented via the FFT. However, AFNO's global filtering cannot adequately represent the small- and moderate-scale structures that commonly appear in natural images. To exploit these coarse-to-fine scale structures, we introduce Multiscale Wavelet Attention (MWA), which leverages wavelet neural operators and incurs linear complexity in the sequence size. We replace the attention in ViT with MWA, and our experiments on CIFAR and Tiny-ImageNet classification demonstrate significant improvement over alternative Fourier-based attentions such as AFNO and the Global Filter Network (GFN).
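The quadratic cost of standard self-attention that motivates MWA is easy to see in a direct implementation: the score matrix is (n, n) for n tokens. The following is a minimal single-head sketch with toy sizes and random weights; it illustrates the standard mechanism, not any architecture from the paper.

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    """Standard scaled dot-product self-attention.

    The `scores` matrix is (n, n) for n tokens, which is the source of
    the quadratic memory/compute cost that AFNO and MWA aim to avoid.
    """
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])                  # (n, n)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v, weights

rng = np.random.default_rng(0)
n, d = 16, 8  # toy sizes: 16 tokens, 8-dim embeddings
x = rng.standard_normal((n, d))
wq, wk, wv = (rng.standard_normal((d, d)) for _ in range(3))
out, attn = self_attention(x, wq, wk, wv)
print(out.shape, attn.shape)  # (16, 8) (16, 16)
```

Doubling the image resolution quadruples n and multiplies the (n, n) score matrix by sixteen, which is exactly why linear-complexity operator-based attentions are attractive at high resolution.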
Priors in Deep Image Restoration and Enhancement: A Survey
Lu, Yunfan, Lin, Yiqi, Wu, Hao, Luo, Yunhao, Zheng, Xu, Xiong, Hui, Wang, Lin
Image restoration and enhancement is the process of improving image quality by removing degradations such as noise, blur, and resolution loss. Deep learning (DL) has recently been applied to image restoration and enhancement. Because the problem is ill-posed, many works have explored priors to facilitate training deep neural networks (DNNs). However, the importance of priors has not yet been systematically studied and analyzed in the research community. Therefore, this paper serves as the first study to provide a comprehensive overview of recent advancements in priors for deep image restoration and enhancement. Our work covers five primary contents: (1) a theoretical analysis of priors for deep image restoration and enhancement; (2) a hierarchical and structural taxonomy of priors commonly used in DL-based methods; (3) an insightful discussion of each prior regarding its principle, potential, and applications; (4) a summary of crucial problems, highlighting potential future directions, especially adopting large-scale foundation models as priors, to spark more research in the community; (5) an open-source repository that provides a taxonomy of all mentioned works and code links.
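As a concrete illustration of what a "prior" buys in an ill-posed restoration problem, a hand-crafted total-variation smoothness prior can be applied to a noisy 1-D signal with a few subgradient steps. This sketch is illustrative only; it is a classical prior of the kind such surveys catalogue, not any specific method from the paper, and the weights and step size are arbitrary.

```python
import numpy as np

def tv_denoise_1d(y, lam=0.5, lr=0.1, n_iters=200):
    """Minimize ||x - y||^2 + lam * sum_i |x[i+1] - x[i]| by (sub)gradient descent.

    The second term is a total-variation prior: it encodes the belief
    that the clean signal is piecewise smooth.
    """
    x = y.copy().astype(float)
    for _ in range(n_iters):
        d = np.sign(np.diff(x))  # subgradient of each |x[i+1] - x[i]| term
        tv_grad = np.concatenate([[-d[0]], d[:-1] - d[1:], [d[-1]]])
        x -= lr * (2.0 * (x - y) + lam * tv_grad)
    return x

rng = np.random.default_rng(0)
clean = np.repeat([0.0, 1.0, 0.0], 20)  # piecewise-constant signal
noisy = clean + 0.2 * rng.standard_normal(clean.size)
denoised = tv_denoise_1d(noisy)
print(np.mean((noisy - clean) ** 2), np.mean((denoised - clean) ** 2))
```

Without the prior term the objective is minimized trivially by x = y; the prior is what injects the missing information that makes the inverse problem solvable.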
Convolutional Neural Network (CNN) to reduce construction loss in JPEG compression caused by Discrete Fourier Transform (DFT)
In recent decades, digital image processing has gained enormous popularity. Consequently, a number of data compression strategies have been put forth with the goal of minimizing the amount of information required to represent images. Among them, JPEG compression is one of the most popular methods and has been widely applied in multimedia and digital applications. The periodic nature of the DFT makes it impossible to meet the periodic condition of an image's opposing edges without producing severe artifacts, which lowers the image's perceptual visual quality. On the other hand, deep learning has recently achieved outstanding results for applications like speech recognition, image recognition, and natural language processing. Convolutional Neural Networks (CNNs) have received more attention than most other types of deep neural networks. The use of convolution in feature extraction results in a less redundant feature map and a smaller dataset, both of which are crucial for image compression. In this work, an effective image compression method is proposed using autoencoders. The study's findings revealed a number of important trends suggesting that better reconstruction, along with good compression, can be achieved using autoencoders.
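The autoencoder-based compression idea can be sketched with a tiny linear autoencoder trained by plain gradient descent. The paper's network is convolutional, so this dense toy version is only an illustration of the bottleneck principle; the sizes, learning rate, and toy data are all assumptions.

```python
import numpy as np

def train_autoencoder(x, code_dim=4, lr=0.01, n_iters=500, seed=0):
    """A tiny one-hidden-layer linear autoencoder trained to reconstruct x.

    x: (n_samples, n_features). The bottleneck of size code_dim is the
    "compressed" representation; the decoder maps it back to n_features.
    """
    rng = np.random.default_rng(seed)
    n, d = x.shape
    We = 0.1 * rng.standard_normal((d, code_dim))  # encoder weights
    Wd = 0.1 * rng.standard_normal((code_dim, d))  # decoder weights
    losses = []
    for _ in range(n_iters):
        code = x @ We                 # encode: d -> code_dim
        recon = code @ Wd             # decode: code_dim -> d
        err = recon - x
        losses.append(float(np.mean(err ** 2)))
        # backpropagate the mean-squared error through both linear layers
        gWd = code.T @ err * (2 / (n * d))
        gWe = x.T @ (err @ Wd.T) * (2 / (n * d))
        We -= lr * gWe
        Wd -= lr * gWd
    return We, Wd, losses

rng = np.random.default_rng(1)
# toy low-rank data: 64 samples living near a 4-dim subspace of R^16
data = rng.standard_normal((64, 4)) @ rng.standard_normal((4, 16))
We, Wd, losses = train_autoencoder(data)
print(losses[0] > losses[-1])
```

Storing the 4-dim code instead of the 16-dim sample is the compression; the falling reconstruction loss shows the decoder recovering the signal from the bottleneck.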
The First Principles of Deep Learning and Compression
The deep learning revolution incited by the 2012 AlexNet paper has been transformative for the field of computer vision. Many problems that were severely limited under classical solutions are now seeing unprecedented success. The rapid proliferation of deep learning methods has led to a sharp increase in their use in consumer and embedded applications. These applications in turn depend on lossy multimedia compression, which is required to engineer the efficient storage and transmission of data in real-world scenarios. As such, there has been increased interest in a deep learning solution for multimedia compression that would allow for higher compression ratios and increased visual quality. The deep learning approach to multimedia compression, so-called Learned Multimedia Compression, involves computing a compressed representation of an image or video using a deep network for the encoder and the decoder. While these techniques have enjoyed impressive academic success, their industry adoption has been essentially non-existent. Classical compression techniques like JPEG and MPEG are too entrenched in modern computing to be easily replaced. This dissertation takes an orthogonal approach and leverages deep learning to improve the compression fidelity of these classical algorithms. This allows the incredible advances in deep learning to be used for multimedia compression without threatening the ubiquity of the classical methods. The key insight of this work is that methods motivated by first principles, i.e., the underlying engineering decisions made when the compression algorithms were developed, are more effective than general methods. By encoding prior knowledge into the design of the algorithm, the flexibility, performance, and/or accuracy are improved at the cost of generality...
Blur, Noise, and Compression Robust Generative Adversarial Networks
Kaneko, Takuhiro, Harada, Tatsuya
Recently, generative adversarial networks (GANs), which learn data distributions through adversarial training, have gained special attention owing to their high image reproduction ability. However, one limitation of standard GANs is that they faithfully recreate training images even when those images exhibit degradations such as blur, noise, and compression artifacts. To remedy this, we address the problem of blur, noise, and compression robust image generation. Our objective is to learn a non-degraded image generator directly from degraded images without prior knowledge of the image degradation. The recently proposed noise robust GAN (NR-GAN) already provides a solution to the problem of noise degradation. Therefore, we first focus on blur and compression degradations. We propose blur robust GAN (BR-GAN) and compression robust GAN (CR-GAN), which learn a kernel generator and a quality factor generator, respectively, along with non-degraded image generators. Owing to the irreversible nature of blur and compression, adjusting their strengths is non-trivial; therefore, we incorporate switching architectures that can adapt the strengths in a data-driven manner. Based on BR-GAN, NR-GAN, and CR-GAN, we further propose blur, noise, and compression robust GAN (BNCR-GAN), which unifies these three models into a single model with additionally introduced adaptive consistency losses that suppress the uncertainty caused by the combination. We provide benchmark scores through large-scale comparative studies on CIFAR-10 and a generality analysis on the FFHQ dataset.
No Multiplication? No Floating Point? No Problem! Training Networks for Efficient Inference
Baluja, Shumeet, Marwood, David, Covell, Michele, Johnston, Nick
Almost all recent neural-network training algorithms rely on gradient-based learning. This has moved the research field away from discrete-valued inference with hard thresholds and toward smooth, continuous-valued activation functions (Werbos, 1974; Rumelhart et al., 1986). Unfortunately, this causes inference to be done with floating-point operations, making it difficult to deploy on an increasingly large set of low-cost, limited-memory, low-power hardware in both commercial (Lane et al., 2015) and research settings (Bourzac, 2017). Avoiding all floating-point operations allows the inference network to realize the power-saving gains available with fixed-point processing (Finnerty & Ratigner, 2017). A different body of research has focused on quantizing and clustering network weights (Yi et al., 2008; Courbariaux et al., 2016; Rastegari et al., 2016; Deng et al., 2017; Wu et al., 2018). For successful deployment of deep neural networks on highly resource-constrained devices (hearing aids, earbuds, wearables), we must simplify the types of operations and the memory/power resources required during inference. Completely avoiding inference-time floating-point operations is one of the simplest ways to design networks for these highly constrained environments. By quantizing both our in-network non-linearities and our network weights, we can move to simple, compact networks without floating-point operations, without multiplications, and without nonlinear function computations. Our approach allows us to explore the spectrum of possible networks, ranging from fully continuous versions down to networks with bi-level weights and activations. Our results show that quantization can be done with little or no loss of performance on both regression tasks (auto-encoding) and multi-class classification tasks (ImageNet). The memory needed to deploy our quantized networks is less than one-third of that of the equivalent architecture using floating-point operations. The activations in our networks emit only a small number of predefined, quantized values (typically 32), and all of the network's weights are drawn from a small number of unique values (typically 100-1000) found by employing a novel periodic adaptive clustering step during training.
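The weight-clustering idea can be sketched with a 1-D k-means pass over a weight matrix; after clustering, each weight is an index into a small codebook, which is what shrinks storage. This is a generic illustration, not the paper's periodic adaptive clustering step, and the sizes are arbitrary.

```python
import numpy as np

def cluster_weights(w, n_values=16, n_iters=20):
    """Quantize a weight array to at most n_values unique values via 1-D k-means.

    Returns the quantized weights and the codebook. After quantization,
    each weight can be stored as a small integer index into the codebook
    instead of a full floating-point number.
    """
    flat = w.ravel()
    # initialize centers from quantiles so every cluster starts with points
    centers = np.quantile(flat, np.linspace(0, 1, n_values))
    for _ in range(n_iters):
        assign = np.argmin(np.abs(flat[:, None] - centers[None, :]), axis=1)
        for k in range(n_values):
            pts = flat[assign == k]
            if pts.size:
                centers[k] = pts.mean()  # move each center to its cluster mean
    assign = np.argmin(np.abs(flat[:, None] - centers[None, :]), axis=1)
    return centers[assign].reshape(w.shape), centers

rng = np.random.default_rng(2)
w = rng.standard_normal((32, 32))
wq, codebook = cluster_weights(w, n_values=16)
print(len(np.unique(wq)) <= 16, wq.shape == w.shape)  # True True
```

With 16 unique values, each weight needs only a 4-bit index plus a shared 16-entry codebook, which is the kind of memory reduction the abstract reports (though the paper clusters adaptively during training rather than once after it).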